Effective Machine Learning Garbage Data Filtering Algorithm for SNS Big Data Processing


Abstract

Social network services (SNS) are used more often today, which results in more SNS data being generated. Furthermore, greater emphasis is placed on extracting various sorts of information through the collection, processing, and analysis of massive volumes of data. Although big data processing can extract a lot of information from data, it takes a long time and a lot of resources. As a result, gaining insights necessitates a significant investment of time and money. In this section, we propose a filtering approach for removing unnecessary data from the SNS data stream. To improve accuracy, the suggested method employs Random Forest, Decision Tree, and XGBoost. Research shows that the algorithm filters the experimental keywords by more than 70%.
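The abstract names the three models but not how they are combined or which features they use. As a rough illustration only, the sketch below assumes bag-of-words features and a simple majority vote across the three classifiers; the toy posts, labels, and the helper names `train_filters` and `filter_stream` are hypothetical and are not taken from the paper. If the `xgboost` package is not installed, scikit-learn's `GradientBoostingClassifier` is used as a stand-in.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier

try:
    from xgboost import XGBClassifier

    def make_boosted():
        return XGBClassifier(n_estimators=50, max_depth=3)
except ImportError:
    # Stand-in when xgboost is unavailable (an assumption of this sketch).
    def make_boosted():
        return GradientBoostingClassifier(
            n_estimators=50, max_depth=3, random_state=0
        )

# Hypothetical toy data: label 1 = garbage (drop), label 0 = useful (keep).
TRAIN_POSTS = [
    "win a free phone click this link now",
    "follow back follow back free followers",
    "cheap pills limited offer click here",
    "free gift card click link to claim",
    "new subway line opens downtown next week",
    "city council votes on the housing budget",
    "heavy rain expected across the region tonight",
    "local team wins the championship final",
]
TRAIN_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def train_filters(posts, labels):
    """Fit a bag-of-words vectorizer and the three classifiers."""
    vectorizer = CountVectorizer()
    features = vectorizer.fit_transform(posts)
    models = [
        RandomForestClassifier(n_estimators=50, random_state=0),
        DecisionTreeClassifier(random_state=0),
        make_boosted(),
    ]
    for model in models:
        model.fit(features, labels)
    return vectorizer, models


def filter_stream(stream, vectorizer, models):
    """Yield posts that fewer than half of the models flag as garbage.

    The majority vote is an assumption of this sketch, not the paper's rule.
    """
    for post in stream:
        features = vectorizer.transform([post])
        votes = sum(int(model.predict(features)[0]) for model in models)
        if votes * 2 < len(models):
            yield post


if __name__ == "__main__":
    vec, fitted = train_filters(TRAIN_POSTS, TRAIN_LABELS)
    incoming = ["click this link for a free phone", "rain expected tonight"]
    for kept in filter_stream(incoming, vec, fitted):
        print(kept)
```

Filtering before the expensive analysis stage is the point of the approach described above: posts dropped here never reach downstream processing, which is where the time and resource savings would come from.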


Similar resources

A survey of machine learning for big data processing

There is no doubt that big data are now rapidly expanding in all science and engineering domains. While the potential of these massive data is undoubtedly significant, fully making sense of them requires new ways of thinking and novel learning techniques to address the various challenges. In this paper, we present a literature survey of the latest advances in research on machine learning for ...


Automated Machine Learning on Big Data using Stochastic Algorithm Tuning

We introduce a means of automating machine learning (ML) for big data tasks, by performing scalable stochastic Bayesian optimisation of ML algorithm parameters and hyper-parameters. More often than not, the critical tuning of ML algorithm parameters has relied on domain expertise from experts, along with laborious hand-tuning, brute search or lengthy sampling runs. Against this background, Bayes...


Big Data Analytic and Mining with Machine Learning Algorithm

Big Data concern large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and the data collection capacity, Big Data are now rapidly expanding in all science and engineering domains, including physical, biological and biomedical sciences. This data-driven model involves demand-driven aggregation of information sources, min...


Machine Learning Models for Housing Prices Forecasting using Registration Data

This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...


How big data changes statistical machine learning

This presentation illustrates how big data forces change on algorithmic techniques and the goals of machine learning, bringing along challenges and opportunities. 1. The theoretical foundations of statistical machine learning traditionally assume that training data is scarce. If one assumes instead that data is abundant and that the bottleneck is the computation time, stochastic algorithms with...



Journal

Journal title: E3S Web of Conferences

Year: 2023

ISSN: 2555-0403, 2267-1242

DOI: https://doi.org/10.1051/e3sconf/202339101056